The data set referenced in this script is generated from the American Community Survey (ACS) and the Washington Office of Superintendent of Public Instruction (OSPI). These sources provide data at the person level, which makes it possible to look at the different indicators by the six equity demographic groups of interest.
This data set was compiled from PUMS data.
Looking at the fields in the data set:
## [1] "Disability_cat" "Income_cat" "LEP_cat" "Older_cat"
## [5] "POC_cat" "Youth_cat" "Total"
## [1] "educational_attainment" "healthcare_coverage"
## [3] "median_household_income" "household_poverty"
## [5] "median_gross_rent" "crowding"
## [7] "SNAP" "internet_access"
## [9] "Kindergarten readiness" "tenure"
## [11] "rent_burden"
## [1] 2021 2022 2019 2018 2017 2016 2015 2014 2013 2012 2011
In this section we make sure that the data set makes sense.
## [1] "King" "Kitsap" "Pierce" "Snohomish" "Region"
## [1] "Income_cat" "LEP_cat" "Disability_cat" "POC_cat"
These fields will vary by indicator:
## [1] "share"
## [1] 2022 2021 2019 2018 2017 2016 2015 2014 2013 2012 2011
## [1] "6 for 6 dimensions"
There are 5 geographies and 4 equity focus groups (each with 2 subgroups). There are 11 years in the data set, and the indicator-specific field has 1 attribute, which means there should be a total of 5 × 4 × 2 × 11 × 1 = 440 rows.
## [1] 1260
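The expected-row arithmetic above can be sketched in base R. This is a minimal illustration on a toy data frame, not the script's actual code: the column names (`data_year`, `county`, `focus_attribute`, `indicator_attribute`) are assumptions, and the toy frame is built to be complete, so both counts come out equal.

```r
# Toy stand-in for the real data set; all column names are assumptions.
df <- expand.grid(
  data_year = c(2011:2019, 2021, 2022),
  county = c("King", "Kitsap", "Pierce", "Snohomish", "Region"),
  focus_attribute = c(
    "POC", "Non-POC", "Low Income", "Non-Low Income",
    "With disability", "Without disability",
    "Limited English proficiency", "English proficient"
  ),
  indicator_attribute = "6 for 6 dimensions",
  stringsAsFactors = FALSE
)

# Count the distinct values of each dimension.
n_distinct <- vapply(df, function(col) length(unique(col)), integer(1))

# A complete data set has one row per combination of dimension values.
expected_rows <- prod(n_distinct)
actual_rows <- nrow(df)

expected_rows
actual_rows
```

On the real data, a difference between `expected_rows` and `actual_rows` is the signal to inspect the breakdown tables that follow.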
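The data set has 1260 rows, not 440. In the tables below, complete cells hold three times the expected number of entries (for example, 24 rather than 8 per year/geography), which points to a factor of 3 that the expected counts do not account for. Cells that fall below those totals indicate missing data.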
If we look at the data by year and geography, there should be 8 entries per year/geography.
##
## King Kitsap Pierce Region Snohomish
## 2011 24 6 24 24 18
## 2012 24 6 24 24 24
## 2013 24 15 24 24 24
## 2014 24 15 24 24 24
## 2015 24 24 24 24 24
## 2016 24 24 24 24 24
## 2017 24 24 24 24 24
## 2018 24 24 24 24 24
## 2019 24 24 24 24 24
## 2021 24 24 24 24 24
## 2022 24 24 24 24 24
Entries are missing for Kitsap (2011-2014) and Snohomish (2011). The year 2020 is absent from the data set entirely.
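A cross-tabulation like the one above can be produced with `table()`, and the incomplete cells can be listed rather than spotted by eye. The sketch below uses made-up data and assumed column names (`data_year`, `county`, `focus_attribute`). It treats the largest cell as the complete count, which is a simplifying assumption.

```r
# Toy data; column names are assumptions.
df <- expand.grid(
  data_year = 2011:2013,
  county = c("King", "Kitsap"),
  focus_attribute = c("POC", "Non-POC"),
  stringsAsFactors = FALSE
)

# Drop Kitsap 2011 to mimic a missing year/geography combination.
df <- df[!(df$county == "Kitsap" & df$data_year == 2011), ]

# Rows per year/geography.
counts <- table(df$data_year, df$county)

# Treat the largest cell as the complete count and list cells that fall short.
complete <- max(counts)
short <- as.data.frame(counts, stringsAsFactors = FALSE)
names(short) <- c("data_year", "county", "n")
short <- short[short$n < complete, ]
short
```

Swapping `county` for the focus group or subgroup column gives the equivalent check for the other breakdowns.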
If we look at the data by year and focus group, there should be 10 entries per year/focus group.
##
## Disability_cat Income_cat LEP_cat POC_cat
## 2011 21 24 27 24
## 2012 24 27 27 24
## 2013 24 30 27 30
## 2014 24 30 27 30
## 2015 30 30 30 30
## 2016 30 30 30 30
## 2017 30 30 30 30
## 2018 30 30 30 30
## 2019 30 30 30 30
## 2021 30 30 30 30
## 2022 30 30 30 30
If we look at the data by year and focus sub-group, there should be 5 entries per year/focus sub-group.
##
## English proficient Limited English proficiency Low Income Non-Low Income
## 2011 15 12 12 12
## 2012 15 12 12 15
## 2013 15 12 15 15
## 2014 15 12 15 15
## 2015 15 15 15 15
## 2016 15 15 15 15
## 2017 15 15 15 15
## 2018 15 15 15 15
## 2019 15 15 15 15
## 2021 15 15 15 15
## 2022 15 15 15 15
##
## Non-POC POC With disability Without disability
## 2011 12 12 9 12
## 2012 12 12 12 12
## 2013 15 15 12 12
## 2014 15 15 12 12
## 2015 15 15 15 15
## 2016 15 15 15 15
## 2017 15 15 15 15
## 2018 15 15 15 15
## 2019 15 15 15 15
## 2021 15 15 15 15
## 2022 15 15 15 15
If we look at the data by year and indicator attribute, there should be 40 entries per year/indicator attribute.
##
## 6 for 6 dimensions
## 2011 96
## 2012 102
## 2013 111
## 2014 111
## 2015 120
## 2016 120
## 2017 120
## 2018 120
## 2019 120
## 2021 120
## 2022 120
Next, we check for 0s and NULLs.
There are no nulls.
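One way to run this check in base R is to count missing values and zeros per numeric column. The sketch below is illustrative only: the column names `fact_value` and `moe` are assumptions, and it relies on missing values showing up as `NA` once the data are in an R data frame.

```r
# Toy data with one NA and one zero in the value column; names are assumptions.
df <- data.frame(
  fact_value = c(0.42, 0, NA, 0.57),
  moe = c(0.03, 0.02, 0.05, NA)
)

# Missing values per column.
na_counts <- colSums(is.na(df))

# Zeros per column, ignoring missing values.
zero_counts <- colSums(df == 0, na.rm = TRUE)

na_counts
zero_counts
```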
Next, we look at the distribution of all data. This is not the most
useful visual, but it provides a high-level sense of the range of
values in one plot.
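A quick overall distribution can be drawn with a base R histogram. This is a sketch on simulated values, and the column name `fact_value` is an assumption, not the field used in the script.

```r
# Simulated shares between 0 and 1; the column name is an assumption.
set.seed(1)
df <- data.frame(fact_value = runif(200))

# Range of values, then a histogram of the full distribution.
range(df$fact_value)
h <- hist(
  df$fact_value,
  breaks = 20,
  main = "Distribution of all values",
  xlab = "Share"
)
```

`hist()` also returns the bin counts invisibly, so `h$counts` can be inspected if an unusual bin needs a closer look.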
In this section we start to explore the data visually -
distribution by the different dimensions within the data set. These
plots are helpful to check for outliers and get a higher level
understanding of the data in one visual, before slicing the data by
geography and equity focus group in the following sections.
The following code will need to be adjusted to fit the fields specific
to the data indicator. For educational attainment, we focus on those
with a Bachelor’s degree or higher. The following code establishes the
data frame that the rest of the analysis uses. If there are fewer than 2
indicator attributes, this section can be skipped/commented
out, but the code will need to be adjusted throughout.
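As a rough illustration of that step, the sketch below keeps a single indicator attribute when more than one is present and otherwise passes the data through unchanged. The column names, the attribute labels, and the `data_clean` name are all hypothetical.

```r
# Toy data with two indicator attributes; names and labels are assumptions.
df <- data.frame(
  data_year = c(2021, 2021, 2022, 2022),
  indicator_attribute = c(
    "Bachelor's degree or higher", "Less than a Bachelor's degree",
    "Bachelor's degree or higher", "Less than a Bachelor's degree"
  ),
  fact_value = c(0.45, 0.55, 0.47, 0.53),
  stringsAsFactors = FALSE
)

attribute_of_interest <- "Bachelor's degree or higher"

# Filter only when there is more than one attribute to choose from.
if (length(unique(df$indicator_attribute)) >= 2) {
  data_clean <- df[df$indicator_attribute == attribute_of_interest, ]
} else {
  data_clean <- df
}

data_clean
```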
This section isn’t relevant for this specific indicator because there aren’t unique indicator attributes.
In this section we explore trends by different groups with MOEs. These charts help to show any missing data by geography, year, or focus group/subgroup.
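A trend chart with MOEs can be sketched in base R by computing lower and upper bounds and drawing them as error bars. The values and the column names (`data_year`, `fact_value`, `moe`) are made up for illustration. The script's own charts may be built differently.

```r
# Toy trend for one geography and subgroup; values and names are assumptions.
df <- data.frame(
  data_year = c(2019, 2021, 2022),
  fact_value = c(0.40, 0.44, 0.47),
  moe = c(0.05, 0.04, 0.06)
)

# Bounds of the estimate: value minus and plus the margin of error.
df$lower <- df$fact_value - df$moe
df$upper <- df$fact_value + df$moe

# Points and connecting line, with the y-axis wide enough for the bars.
plot(
  df$data_year, df$fact_value,
  type = "b",
  ylim = range(df$lower, df$upper),
  xlab = "Year", ylab = "Share"
)

# Flat-capped vertical segments serve as error bars.
arrows(
  df$data_year, df$lower, df$data_year, df$upper,
  angle = 90, code = 3, length = 0.05
)
```

A year with no row simply has no point or bar, which is how gaps by geography, year, or subgroup become visible in this kind of chart.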
In this section we further develop the draft visuals for communicating the results and supporting the narrative for the Equity Tracker webpages. These charts are slightly more refined by slicing the data by geography and equity focus group. The line charts don’t include MOEs, but they help make connections between the same groups over time.
The 5 geographies are all included in the facets by geography, but they could be separated out to create 5 individual charts - one for each geography.
The equity focus groups are all included in the facets by focus group, but they could be separated out to create individual charts - one for each focus group (this data set contains 4 of the 6).
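The choice between one faceted chart and separate charts can be sketched with ggplot2 as follows. This assumes ggplot2 is installed. The data are made up, and the column names and the `make_chart` helper are hypothetical, not taken from the script.

```r
library(ggplot2)

# Toy data; column names and values are assumptions.
df <- expand.grid(
  data_year = c(2019, 2021, 2022),
  county = c("King", "Region"),
  focus_attribute = c("POC", "Non-POC"),
  stringsAsFactors = FALSE
)
df$fact_value <- seq(0.30, 0.52, length.out = nrow(df))

# Hypothetical helper: one line per subgroup over time.
make_chart <- function(d) {
  ggplot(d, aes(x = data_year, y = fact_value, color = focus_attribute)) +
    geom_line() +
    geom_point() +
    scale_x_continuous(breaks = unique(d$data_year)) +
    labs(x = "Year", y = "Share", color = "Subgroup")
}

# Option 1: all geographies in one chart, one facet each.
p_faceted <- make_chart(df) + facet_wrap(~county)

# Option 2: one separate chart per geography.
plots_by_county <- lapply(split(df, df$county), make_chart)
```

Splitting by the focus group column instead of `county` gives the per-focus-group version of the same two options.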
Resource for visual
The code to make this type of visual is long - adjust to the indicator as
needed (scale_x_continuous, labs, label, etc.).
This section needs to be edited. Keep the code chunks commented out for now as we draft and refine the visuals.
This section includes visuals that were determined to be less useful. We didn’t want to lose the work, but didn’t want to include it in the main workflow. Feel free to comment out if you don’t want to adjust the arguments to fit the indicator of interest.
There are five charts for the different geographies: Region and the 4
counties.
There are 6 charts for the different equity groups: POC, low-income, etc.